What Others Say About This Work? Scalable Extraction of Citation Contexts from Research Papers

Author: M Hall
M Khabsa
MA Hearst
MT Luong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

This work presents a new, scalable solution to the problem of extracting citation contexts: the textual fragments surrounding citation references. These citation contexts can be used to navigate digital libraries of research papers and help users decide what to read. We have developed a prototype system which can retrieve, on demand, citation contexts from the full text of over 15 million research articles in the Mendeley catalog for a given reference research paper. The evaluation results show that our citation extraction system provides additional functionality over existing tools and runs two orders of magnitude faster, while providing a 9% improvement in F-measure over the current state of the art.

Crossref

Open Research Online (The Open University)

A Review of Object Detection Models based on Convolutional Neural Network

Author: DG Lowe
E Shelhamer
G Wolberg
K Fukushima
M Everingham
MA Hearst
O Russakovsky
PF Felzenszwalb
T-Y Lin
W Li
Y LeCun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2019

Convolutional neural networks (CNNs) have become the state of the art for object detection in images. In this chapter, we explain different state-of-the-art CNN-based object detection models. We categorize these detection models according to two approaches: the two-stage approach and the one-stage approach. The chapter traces advancements in object detection models from R-CNN to the latest RefineDet. It also discusses the model description and training details of each model, and draws a comparison among the models. Comment: 17 pages, 11 figures, 1 table

arXiv.org e-Print Archive

Crossref

A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method

Author: A Wu
AL Barabasi
F Beil
G Erkan
Il-Yeol Song
Illhoi Yoo
J Ghosh
J Kleinberg
LAN Amaral
M Steinbach
MA Hearst
MEJ Newman
P Erdos
P Pantel
R Ferrer-Cancho
R Rada
RA Hanneman
S Salton
T Nomato
Xiaohua Hu
Y Zeng
Publication venue: BioMed Central
Publication date: 01/01/2007

Crossref

Springer - Publisher Connector

PubMed Central

DNA Renaturation at the Water-Phenol Interface

Author: Alberts
Anderson
Anderson
Belhachemi
Berg
Berg
Bernal
Brahms
Braun
Campell
Carri
Carrington
Cech
Chan
Chaperon
Chomczynski
Chow
Christiansen
Craig
Davis
Deamer
DiMarzio
Duckett
Eigen
Eigen
Eisenberg
Escara
Felsenfeld
Fleming
Gennes
Gierer
Gilbert
Gold
Grassmann
Grosberg
Hagenmuller
Hammes
Hearst
Hegner
Helmer
Herschlag
Herskovits
Hill
Jaeger
Joanicot
Joyce
Joyce
Joyce
Katcher
Kauzmann
Kirby
Kirby
Kirby
Kohne
Kowalczykowski
Lahav
Lang
Laurent
Lennard-Jones
Leo
Lerman
Levine
Lohman
Ma
Maier
Maier
Mandell
Manning
Marmur
Marmur
Massie
Meijering
Meuthen
Miller
Morimatsu
Nadassi
Nandi
Neidle
Noller
Oparin
Pace
Pace
Paecht-Horowitz
Papafil
Pelta
Perutz
Piechowska
Piechowska
Pontius
Prigogine
Pusztai
Pörschke
Rhodes
Richter
Riggs
Robertson
Rothman
Saenger
Saito
Saxinger
Schaper
Schürmann
Sevag
Sikorav
Singer
Sinsheimer
Skinner
Smith
Smoluchowski
Studier
Studier
Szostak
Tabor
Tikhonenko
Timmermans
Tor
Tsang
Tsuchihashi
Ts’o
Ts’o
Vesnaver
Wagner
Wang
Ward
Weinstock
Westhof
Wetmur
Wetmur
Widom
Wieder
Winter
Woese
Wolfe
Zaks
Zaks
Zasloff
Zhulina
Ölçer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/11/2003

We study DNA adsorption and renaturation in a water-phenol two-phase system, with or without shaking. In very dilute solutions, single-stranded DNA is adsorbed at the interface in a salt-dependent manner. At high salt concentrations the adsorption is irreversible. The adsorption of the single-stranded DNA is specific to phenol and relies on stacking and hydrogen bonding. We establish the interfacial nature of DNA renaturation at a high salt concentration. In the absence of shaking, this reaction involves an efficient surface diffusion of the single-stranded DNA chains. In the presence of vigorous shaking, the bimolecular rate of the reaction exceeds the Smoluchowski limit for a three-dimensional diffusion-controlled reaction. DNA renaturation under these conditions is known as the Phenol Emulsion Reassociation Technique, or PERT. Our results establish the interfacial nature of PERT. A comparison of this interfacial reaction with other approaches shows that PERT is the most efficient technique and reveals similarities between PERT and the renaturation performed by single-stranded nucleic acid binding proteins. Our results lead to a better understanding of the partitioning of nucleic acids in two-phase systems, and should help design improved extraction procedures for damaged nucleic acids. We present arguments in favor of a role of phenol and the water-phenol interface in prebiotic chemistry. The most efficient renaturation reactions (in the presence of condensing agents or with PERT) occur in heterogeneous systems. This reveals the limitations of homogeneous approaches to the biochemistry of nucleic acids. We propose a heterogeneous approach to overcome the limitations of the homogeneous viewpoint.

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

HAL-CEA

Cross-cultural adaptation, validation and reliability of the Body Area Scale for Brazilian adolescents

Author: Abreu AM
Aluísio Segurado
Beaton DE
Cash TF
Conti MA
Dounchis JZ
Gordon CC
Grassi-Oliveira R
Gullemin F
Kostanski M
La Taille Y
Lerner RM
Levine MP
Maria Aparecida Conti
Maria do Rosário Dias de Oliveira Latorre
McCabe MP
Mendelson BK
Norman Hearst
Pinheiro AP
Reichenheim ME
Richards MH
Rosenblum GD
Sapp SG
Scagliusi FB
Smolak L
Thompson JK
Thompson MA
Tiggemann M
Wertheim EH
Publication venue: 'FapUNIFESP (SciELO)'

Crossref

PageRank without hyperlinks: Reranking with PubMed related article networks for biomedical text retrieval

Author: A Leuski
CJ van Rijsbergen
CW Cleverdon
DK Harman
E Voorhees
F Diaz
G Amati
G Erkan
H Abdi
J Lin
Jimmy Lin
JM Kleinberg
JP Shaffer
L Page
MA Hearst
MD Smucker
O Kurland
P Pirolli
R Mihalcea
WJ Wilbur
WR Hersh
X Huang
X Liu
Y Lin
Publication venue: BioMed Central
Publication date: 01/01/2008

Graph analysis algorithms such as PageRank and HITS have been successful in Web environments because they are able to extract important inter-document relationships from manually created hyperlinks. We consider the application of these algorithms to related document networks composed of automatically generated content-similarity links. Specifically, this work tackles the problem of document retrieval in the biomedical domain, in the context of the PubMed search engine. A series of reranking experiments demonstrates that incorporating evidence extracted from link structure yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to different environments.

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Repository at the University of Maryland

Lexicon induction for interpretable text classification.

Author: A Bandhakavi
CM Bishop
J Clos
J Diederich
H Drucker
M Fernández-Delgado
MA Hearst
GA Miller
A Muhammad
F Pedregosa
G Salton
PJ Stone
W Zhang
H Zou
Publication venue: Springer
Publication date: 02/09/2017

The automated classification of text documents is an active research challenge in document-oriented information systems, helping users browse massive amounts of data, detecting likely authors of unsigned work, or analyzing large corpora along predefined dimensions of interest such as sentiment or emotion. Existing approaches to text classification tend toward building black-box algorithms, offering accurate classification at the price of not understanding the rationale behind each algorithmic prediction. Lexicon-based classifiers offer an alternative to black-box classifiers by modeling the classification problem with a trivially interpretable classifier. However, current techniques for lexicon-based document classification limit themselves to using either handcrafted lexicons, which suffer from human bias and are difficult to extend, or automatically generated lexicons, which are induced using point estimates of some predefined probabilistic measure in the corpus of interest. This paper proposes LexicNet, an alternative way of generating high-accuracy classification lexicons offering an optimal generalization power without sacrificing model interpretability. We evaluate our approach on two tasks: stance detection and sentiment classification. We find that our lexicon outperforms baseline lexicon induction approaches as well as several standard text classifiers.

Crossref

Open Access Institutional Repository at Robert Gordon University

Biview learning for human posture segmentation from 3D points cloud

Author: B Xie
CC Chang
D Comaniciu
D Grest
D Tao
Dacheng Tao
Hans A. Kestler
J Cheng
J Wen
J Yu
JR Quinlan
Jun Cheng
MA Hearst
Maoying Qiao
PF Felzenszwalb
PN Belhumeur
R Poppe
S Balla-Arabé
T Xia
TB Moeslund
Wei Bian
X Gao
X Li
X Wang
Y Amit
Y Luo
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2014

Posture segmentation plays an essential role in human motion analysis. The state-of-the-art method extracts sufficiently high-dimensional features from 3D depth images for each 3D point and learns an efficient body part classifier. However, high-dimensional features are memory-consuming and difficult to handle on large-scale training datasets. In this paper, we propose an efficient two-stage dimension reduction scheme, termed biview learning, to encode two independent views: depth-difference features (DDF) and relative position features (RPF). Biview learning explores the complementary property of DDF and RPF, and uses two stages to learn a compact yet comprehensive low-dimensional feature space for posture segmentation. In the first stage, discriminative locality alignment (DLA) is applied to the high-dimensional DDF to learn a discriminative low-dimensional representation. In the second stage, canonical correlation analysis (CCA) is used to explore the complementary property of RPF and the dimensionality-reduced DDF. Finally, we train a support vector machine (SVM) over the output of CCA. We carefully validate the effectiveness of DLA and CCA utilized in the two-stage scheme on our 3D human point cloud dataset. Experimental results show that the proposed biview learning scheme significantly outperforms the state-of-the-art method for human posture segmentation. © 2014 Qiao et al.

CiteSeerX

Crossref

ACU Research Bank

OPUS - University of Technology Sydney

Directory of Open Access Journals

PubMed Central

Prediction of catalytic residues using Support Vector Machine with selected protein sequence and structural properties

Author: A Andreeva
A Gutteridge
AH Elcock
AR Panchenko
B Lee
B Rost
BW Mathews
CA Innis
Cathy H Wu
CH Wu
DK Smith
GJ Bartlett
H Yao
HM Berman
IH Witten
JC Platt
JD Thompson
JS Milton
K Kinoshita
K Sjolander
M Ota
MA Hearst
MJ Ondrechen
Natalia V Petrova
O Lichtarge
P Aloy
PP Wangikar
R Kohavi
R Koradi
R Landgraf
RL Tatusov
S Chakravarty
S Jones
S Parthasarathy
S Zhu
SF Altschul
SJ Campbell
SJ Hubbard
TA Binkowski
W Kabsch
W Tian
WSJ Valdar
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The number of protein sequences deriving from genome sequencing projects is outpacing our knowledge about the function of these proteins. With the gap between experimentally characterized and uncharacterized proteins continuing to widen, it is necessary to develop new computational methods and tools for functional prediction. Knowledge of catalytic sites provides a valuable insight into protein function. Although many computational methods have been developed to predict catalytic residues and active sites, their accuracy remains low, with a significant number of false positives. In this paper, we present a novel method for the prediction of catalytic sites, using a carefully selected, supervised machine learning algorithm coupled with an optimal discriminative set of protein sequence conservation and structural properties. RESULTS: To determine the best machine learning algorithm, 26 classifiers in the WEKA software package were compared using a benchmarking dataset of 79 enzymes with 254 catalytic residues in a 10-fold cross-validation analysis. Each residue of the dataset was represented by a set of 24 residue properties previously shown to be of functional relevance, as well as a label {+1/-1} to indicate catalytic/non-catalytic residue. The best-performing algorithm was the Sequential Minimal Optimization (SMO) algorithm, which is a Support Vector Machine (SVM). The Wrapper Subset Selection algorithm further selected seven of the 24 attributes as an optimal subset of residue properties, with sequence conservation, catalytic propensities of amino acids, and relative position on protein surface being the most important features. CONCLUSION: The SMO algorithm with 7 selected attributes correctly predicted 228 of the 254 catalytic residues, with an overall predictive accuracy of more than 86%. Missing only 10.2% of the catalytic residues, the method captures the fundamental features of catalytic residues and can be used as a "catalytic residue filter" to facilitate experimental identification of catalytic residues for proteins with known structure but unknown function.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Text Mining the History of Medicine

Author: A Henriksson
AR Aronson
C Mihăilă
Carsten Timmermann
D Lopresti
D McClosky
Elizabeth Toon
G Hripcsak
G Schneider
Georgios Kontonatsios
H Moen
H Suominen
J Cohen
J-D Kim
Jacob Carter
John McNaught
JR Firth
K Bontcheva
KB Wagholikar
L Kelly
LM Schriml
Luis M. Rocha
M Miwa
M Ruiz-Casado
M Worboys
MA Hearst
Michael Worboys
N Alnazzawi
O Bodenreider
P Murrieta-Flores
P Thompson
Paul Thompson
R Prasad
RI Dogan
Riza Theresa Batista-Navarro
S Jonnalagadda
S Pyysalo
S Zhang
Sophia Ananiadou
T Hitchcock
TH Tanner
Y Tsuruoka
Y Wang
Z Liu
ZS Harris
Ö Uzuner
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 06/01/2016

Historical text archives constitute a rich and diverse source of information, which is becoming increasingly accessible thanks to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results, or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid-19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform.

Crossref

Directory of Open Access Journals

Edge Hill University Research Information Repository

PubMed Central

The University of Manchester - Institutional Repository